Conversational spoken dialogue systems that interact with the user rather than merely reading out text can be equipped with hesitations to manage dialogue flow and user attention. Based on a series of empirical studies, we developed a hesitation synthesis strategy for dialogue systems that inserts hesitations of scalable extent wherever needed in the ongoing utterance. Previous evaluations of hesitation systems have shown that hesitations negatively affect synthesis quality but improve interaction quality. We argue that, due to its conversational nature, hesitation synthesis requires interactive evaluation rather than traditional mean opinion score (MOS)-based questionnaires. To validate this claim, we evaluate our system's speech synthesis component in two ways: linked to the dialogue system evaluation on the one hand, and with a traditional MOS questionnaire on the other. This allows us to analyze and discuss differences that arise from the evaluation methodology. Our results suggest that MOS scales are not sufficient to assess speech synthesis quality, leading to implications for future research that are discussed in this paper. Furthermore, our results indicate that synthetic hesitations can increase task performance and that an elaborated hesitation strategy is necessary to avoid likability issues.